9 research outputs found

    MADiff: Offline Multi-agent Learning with Diffusion Models

    Diffusion models (DMs), as powerful generative models, have recently achieved huge success in various scenarios, including offline reinforcement learning, where the policy learns to plan by generating trajectories during online evaluation. However, despite their effectiveness in single-agent learning, it remains unclear how DMs can operate in multi-agent problems, where independently modeling each agent's trajectories leaves agents unable to complete teamwork for lack of coordination. In this paper, we propose MADiff, a novel generative multi-agent learning framework to tackle this problem. MADiff is realized with an attention-based diffusion model that models the complex coordination among the behaviors of multiple diffusion agents. To the best of our knowledge, MADiff is the first diffusion-based multi-agent offline RL framework; it behaves as both a decentralized policy and a centralized controller, includes opponent modeling, and can be used for multi-agent trajectory prediction. MADiff takes advantage of the powerful generative ability of diffusion while being well suited to modeling complex multi-agent interactions. Our experiments show the superior performance of MADiff compared to baseline algorithms in a range of multi-agent learning tasks. Comment: 17 pages, 7 figures, 4 tables

    An Improved Genetic-Shuffled Frog-Leaping Algorithm for Permutation Flowshop Scheduling

    Because it is NP-hard, the permutation flowshop scheduling problem (PFSSP) is a fundamental issue for Industry 4.0, especially given the demand for higher productivity, efficiency, and self-managing systems. This paper proposes an improved genetic-shuffled frog-leaping algorithm (IGSFLA) to solve the PFSSP. In the proposed IGSFLA, the optimal initial frog (individual) in the initialized group is generated by the heuristic optimal-insert method with a fitness constraint. The crossover mechanism is applied to both the subgroup and the global group to avoid local optima and accelerate evolution. To further improve frogs that share the same optimal fitness, the disturbance mechanism is applied to obtain the optimal frog of the whole group at the initialization step and the optimal frog of the subgroup at the searching step. The mathematical model of the PFSSP is established with the minimum production cycle (makespan) as the objective function, the fitness of a frog is defined, and the IGSFLA-based PFSSP solution is proposed. Experimental results are presented and analyzed, showing that IGSFLA not only provides the optimal scheduling performance but also converges effectively.
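
The abstract above names makespan as the objective function but does not spell out the model. As a minimal sketch, assuming the standard permutation flowshop setting (every job visits the machines in the same order, and each machine processes jobs in the order given by the permutation), the makespan of a candidate schedule can be computed as follows. The names `makespan`, `perm`, and `proc` are illustrative and do not come from the paper.

```python
from typing import Sequence


def makespan(perm: Sequence[int], proc: Sequence[Sequence[int]]) -> int:
    """Return the makespan of a job permutation in a permutation flowshop.

    perm: order in which jobs are processed (the same order on every machine).
    proc: proc[j][m] is the processing time of job j on machine m.
    """
    if not perm:
        return 0
    n_machines = len(proc[0])
    # completion[m] is the completion time on machine m of the job
    # scheduled most recently.
    completion = [0] * n_machines
    for job in perm:
        # The first machine handles jobs back to back.
        completion[0] += proc[job][0]
        for m in range(1, n_machines):
            # A job starts on machine m once machine m is free and the job
            # has finished on machine m - 1.
            start = max(completion[m], completion[m - 1])
            completion[m] = start + proc[job][m]
    return completion[-1]


# Hypothetical instance: 3 jobs, 2 machines.
example_proc = [[3, 2], [1, 4], [2, 1]]
print(makespan([0, 1, 2], example_proc))  # 10
print(makespan([1, 0, 2], example_proc))  # 8
```

A metaheuristic such as the one described would use a function like this (or a fitness derived from it) to compare candidate permutations. The sketch only evaluates a given permutation and does not implement any part of IGSFLA.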